SARIMAX (Seasonal ARIMA with Exogenous Variables)#
SARIMAX extends SARIMA by adding a regression on external variables (a.k.a. exogenous regressors).
This notebook focuses on:
how exogenous variables enter the model
causal vs correlational interpretation (and what “counterfactual” means here)
practical requirements (alignment, scaling, forecast-time availability)
an end-to-end demo with NumPy + statsmodels + Plotly
Learning goals#
By the end you should be able to:
write SARIMAX in compact operator notation
explain how exog affects the conditional mean and how SARIMA handles residual autocorrelation
list practical requirements for exogenous variables (alignment, leakage, availability)
visualize the impact of exogenous variables and run scenario / counterfactual-style forecasts
implement a simple NumPy two-stage approximation (OLS + seasonal AR errors)
1) Model formulation (regression + SARIMA errors)#
Let \(y_t\) be the target series and \(\mathbf{x}_t \in \mathbb{R}^k\) be the vector of exogenous variables available at time \(t\).
Regression (mean) part

\[
y_t = \beta_0 + \mathbf{x}_t^\top \boldsymbol{\beta} + u_t,
\]

where \(u_t\) is an autocorrelated error.
SARIMA error part
Let \(B\) be the backshift operator \(B y_t = y_{t-1}\). Define:

\[
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p,
\qquad
\Phi(B^s) = 1 - \Phi_1 B^s - \dots - \Phi_P B^{Ps},
\]

\[
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q,
\qquad
\Theta(B^s) = 1 + \Theta_1 B^s + \dots + \Theta_Q B^{Qs}.
\]

Then the seasonal ARIMA(\(p,d,q\))×(\(P,D,Q\))\(_s\) model for \(u_t\) is:

\[
\phi(B)\,\Phi(B^s)\,(1-B)^d (1-B^s)^D u_t = \theta(B)\,\Theta(B^s)\,\varepsilon_t,
\qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2).
\]

Putting both together yields SARIMAX.
Equivalently, you can write the combined model as:

\[
\phi(B)\,\Phi(B^s)\,(1-B)^d (1-B^s)^D \left(y_t - \beta_0 - \mathbf{x}_t^\top \boldsymbol{\beta}\right) = \theta(B)\,\Theta(B^s)\,\varepsilon_t.
\]
Key intuition
\(\beta_0 + \mathbf{x}_t^\top\boldsymbol{\beta}\) explains the part of \(y_t\) driven by observed external signals.
SARIMA explains the remaining autocorrelation in the residual \(u_t\).
2) How exogenous variables are incorporated (precisely)#
In SARIMAX, exogenous variables enter the observation equation as a linear regression.
In statsmodels, exog is a matrix \(X\) whose \(t\)-th row is \(\mathbf{x}_t^\top\). The coefficients \(\boldsymbol{\beta}\) are estimated (typically) jointly with ARIMA parameters via maximum likelihood using a state-space model and the Kalman filter.
Practical patterns:
Contemporaneous effects: include \(x_t\).
Lagged effects: include \(x_{t-1}, x_{t-2}, \dots\) explicitly as extra columns.
Event indicators / interventions: use 0/1 dummies (policy change, promotions, outages).
Deterministic seasonality/trend: can be included either via SARIMA terms or as regressors (seasonal dummies, Fourier terms), depending on preference.
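The patterns above can be sketched as one exog-building step. This is a minimal illustration with made-up series names (`price`, `policy_change`); the column choices are assumptions for the example, not prescriptions:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2015-01-01", periods=36, freq="MS")
rng = np.random.default_rng(0)
price = pd.Series(10 + 0.1 * rng.normal(0, 1, 36).cumsum(), index=idx)

X = pd.DataFrame(index=idx)
X["price"] = price                                       # contemporaneous effect
X["price_lag1"] = price.shift(1)                         # lagged effect as an extra column
X["policy_change"] = (idx >= "2016-06-01").astype(int)   # 0/1 intervention dummy
# Fourier terms as deterministic annual-seasonality regressors:
t = np.arange(len(idx))
X["sin12"] = np.sin(2 * np.pi * t / 12)
X["cos12"] = np.cos(2 * np.pi * t / 12)

X = X.dropna()  # drop the first row lost to the lag
print(X.head())
```

Each column here would be passed as one row-per-timestamp matrix via the exog argument.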
Interpretation of \(\beta_j\): holding the ARIMA error structure fixed, increasing \(x_{t,j}\) by 1 changes the model’s conditional mean for \(y_t\) by \(\beta_j\) units (in the scale of your transformed variables).
Econometric exogeneity (strong form) is often written as \(\mathbb{E}[\varepsilon_t \mid X_{1:T}] = 0\); if \(\mathbf{x}_t\) is correlated with the innovation \(\varepsilon_t\) (endogeneity), \(\hat{\boldsymbol{\beta}}\) is biased.
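A tiny simulation (hypothetical numbers, plain least squares standing in for the regression part) makes the endogeneity bias concrete: when the regressor shares a shock with the innovation, the estimated coefficient is pulled away from the true value.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
eps = rng.normal(size=n)
# Endogenous regressor: x is partly driven by the same shock as y's innovation.
x = rng.normal(size=n) + 0.8 * eps
beta_true = 2.0
y = beta_true * x + eps

# OLS attributes the shared shock to x, biasing the slope upward
# (theoretical limit here: 2 + 0.8 / 1.64 ≈ 2.49).
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("true beta:", beta_true, "OLS estimate:", round(beta_hat[1], 3))
```

With exogenous \(x\) (drop the `0.8 * eps` term), the same estimator recovers \(\beta\) almost exactly.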
3) Causal vs correlational usage (and real-world examples)#
Predictive / correlational (the default)#
In most forecasting settings, SARIMAX is used to estimate conditional associations:

\[
\mathbb{E}\left[y_t \mid \mathbf{x}_{1:t},\, y_{1:t-1}\right],
\]

i.e., the best prediction of \(y_t\) given the exogenous path and the series' own past — not the effect of intervening on \(\mathbf{x}_t\).
This is often all you need for accurate forecasts.
When can coefficients be interpreted causally?#
A causal interpretation (“if we intervene on \(x\), \(y\) will change”) requires assumptions beyond SARIMAX:
No reverse causality: \(y\) does not influence \(x\) (or you model that feedback explicitly).
No omitted confounders: nothing unobserved drives both \(x\) and \(y\).
Correct timing: \(\mathbf{x}_t\) must be known at or before time \(t\) (no look-ahead).
Structural stability: the relationship is stable under the intervention you care about.
Without these, a SARIMAX “counterfactual” is a model-based scenario, not a causal estimate.
Examples#
Finance (predictive): forecasting volatility/returns using exogenous signals such as VIX, macro announcements, funding rates, or regime indicators. Be cautious: market variables frequently have feedback loops → endogeneity.
Economics (scenario analysis): forecasting inflation with oil prices and policy rates; demand with prices/promotions; unemployment with job openings. For policy effects, identification strategies (IV, DiD, SVAR, etc.) are typically needed.
4) Requirements for exogenous variables (alignment, scaling, leakage)#
Minimal practical requirements (for forecasting):
Same time granularity as \(y_t\) (or resampled/aggregated correctly).
Correct time alignment: the row \(\mathbf{x}_t\) must only use information available at time \(t\) (or earlier).
No unexpected missingness; handle gaps explicitly (impute, drop, or model).
Reasonable collinearity (high collinearity makes \(\boldsymbol{\beta}\) unstable).
For future forecasting, \(\mathbf{x}_{t+h}\) must be known or scenarized for the horizon.
Alignment checklist (avoid leakage):
Join \(y\) and \(X\) on the same timestamp index; confirm lengths after cleaning.
If a regressor is published with delay (common in macro data), shift it so \(\mathbf{x}_t\) reflects what was known at time \(t\).
For lagged effects, create explicit lag columns (e.g., `x.shift(1)`); avoid `shift(-1)` features unless you truly know the future.
Decide how to handle missing timestamps (drop vs forward-fill) based on the meaning of the variable.
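A minimal sketch of the publication-delay shift, using a hypothetical monthly `cpi` series published one month late:

```python
import pandas as pd

idx = pd.date_range("2020-01-01", periods=6, freq="MS")
# Hypothetical macro series, published with a one-month delay:
cpi = pd.Series([100.0, 100.4, 100.9, 101.1, 101.6, 102.0], index=idx, name="cpi")

# Shift so that row t holds the value actually *known* at time t.
exog = pd.DataFrame({"cpi_known": cpi.shift(1)})
# An explicit lag feature on top of the publication shift:
exog["cpi_known_lag1"] = exog["cpi_known"].shift(1)
print(exog)
```

The first row(s) become NaN; drop them (and the matching rows of \(y\)) before fitting so lengths stay aligned.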
Scaling / transforms:
Standardize continuous regressors using training mean/std and apply the same transform to future `exog`.
Keep binary indicators (0/1 dummies) unscaled in most cases.
Scaling \(x\) does not change forecasts if you transform coefficients consistently, but it can improve numerical stability of MLE.
Interpret \(\boldsymbol{\beta}\) in the scale of your transformed variables (log, diff, percent).
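A minimal sketch of leakage-free scaling, with made-up `rate`/`promo` columns: statistics come from the training window only and are reused verbatim on future exog.

```python
import pandas as pd

train = pd.DataFrame({"rate": [2.0, 2.2, 1.9, 2.1], "promo": [0, 1, 0, 1]})
future = pd.DataFrame({"rate": [2.4, 2.5], "promo": [1, 0]})

# Fit scaling on the training window only...
mu, sd = train["rate"].mean(), train["rate"].std()
train_scaled = train.assign(rate=(train["rate"] - mu) / sd)
# ...and reuse the SAME statistics for future exog (no leakage).
future_scaled = future.assign(rate=(future["rate"] - mu) / sd)
# Binary indicators stay 0/1.
print(future_scaled)
```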
Lag specification:
If effects are delayed, include lags explicitly; SARIMA terms capture autocorrelation, not causal delay in \(x \to y\).
Non-stationary series:
Differencing is applied to the ARIMA error part; if \(y\) and \(x\) are trending, you may need trends, differences, or cointegration-aware modeling to avoid spurious relationships.
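The classic demonstration of a spurious relationship: two independent random walks often show a sizeable \(R^2\) in levels, which collapses after differencing. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
# Two *independent* random walks: no true relationship.
y = np.cumsum(rng.normal(size=n))
x = np.cumsum(rng.normal(size=n))

def r2(x, y):
    # OLS with intercept; R^2 = 1 - SSR/SST.
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

r2_levels = r2(x, y)                          # often spuriously large
r2_diff = r2(np.diff(x), np.diff(y))          # near zero, as it should be
print("R^2 in levels:     ", round(r2_levels, 3))
print("R^2 in differences:", round(r2_diff, 3))
```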
import warnings
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import os
import plotly.io as pio
import statsmodels.api as sm
warnings.filterwarnings("ignore", category=UserWarning, module="joblib")
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
SEED = 7
rng = np.random.default_rng(SEED)
import numpy, pandas, statsmodels, plotly
print("numpy", numpy.__version__)
print("pandas", pandas.__version__)
print("statsmodels", statsmodels.__version__)
print("plotly", plotly.__version__)
numpy 1.26.2
pandas 2.1.3
statsmodels 0.14.4
plotly 6.5.2
# --- Synthetic data with known exogenous effects ---
n = 180
s = 12 # monthly seasonality
idx = pd.date_range("2010-01-01", periods=n, freq="MS")
# Exogenous variables (think: policy rate + a temporary intervention)
rate = np.zeros(n)
rate[0] = 2.0
for t in range(1, n):
rate[t] = 0.9 * rate[t - 1] + 0.2 + rng.normal(0, 0.15)
stimulus = np.zeros(n)
stimulus[60:75] = 1.0
stimulus[120:132] = 1.0
# Seasonal AR errors: u_t = phi u_{t-1} + Phi u_{t-s} + eps_t
phi = 0.5
Phi = 0.3
sigma = 1.5
u = np.zeros(n)
eps = rng.normal(0, sigma, size=n)
for t in range(n):
ar1 = phi * u[t - 1] if t - 1 >= 0 else 0.0
sar1 = Phi * u[t - s] if t - s >= 0 else 0.0
u[t] = ar1 + sar1 + eps[t]
# True regression effects
beta0_true = 50.0
beta_rate_true = -4.0
beta_stimulus_true = 8.0
y = beta0_true + beta_rate_true * rate + beta_stimulus_true * stimulus + u
df = pd.DataFrame({"y": y, "rate": rate, "stimulus": stimulus}, index=idx)
df.head()
| date | y | rate | stimulus |
|---|---|---|---|
| 2010-01-01 | 41.949314 | 2.000000 | 0.0 |
| 2010-02-01 | 41.312201 | 2.000185 | 0.0 |
| 2010-03-01 | 40.714617 | 2.044978 | 0.0 |
| 2010-04-01 | 42.394950 | 1.999359 | 0.0 |
| 2010-05-01 | 42.280054 | 1.865835 | 0.0 |
# Plot the target and exogenous drivers
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
go.Scatter(x=df.index, y=df["y"], name="y (target)", line=dict(color="black")),
secondary_y=False,
)
fig.add_trace(
go.Scatter(x=df.index, y=df["rate"], name="rate (exog)", line=dict(color="#4E79A7")),
secondary_y=True,
)
fig.add_trace(
go.Bar(x=df.index, y=df["stimulus"], name="stimulus (exog)", opacity=0.25, marker_color="#E15759"),
secondary_y=True,
)
fig.update_layout(
title="Synthetic series with exogenous drivers",
xaxis_title="Date",
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0),
)
fig.update_yaxes(title_text="y", secondary_y=False)
fig.update_yaxes(title_text="rate / stimulus", secondary_y=True)
fig.show()
# --- Fit SARIMAX (with exog) vs SARIMA (no exog) on a holdout split ---
h = 24
train = df.iloc[:-h]
test = df.iloc[-h:]
exog_cols = ["stimulus", "rate"]
order = (1, 0, 0)
seasonal_order = (1, 0, 0, 12)
res_x = sm.tsa.SARIMAX(
train["y"],
exog=train[exog_cols],
order=order,
seasonal_order=seasonal_order,
trend="c",
enforce_stationarity=True,
enforce_invertibility=True,
).fit(disp=False, method="lbfgs", maxiter=500)
res_0 = sm.tsa.SARIMAX(
train["y"],
order=order,
seasonal_order=seasonal_order,
trend="c",
enforce_stationarity=True,
enforce_invertibility=True,
).fit(disp=False, method="lbfgs", maxiter=500)
fcst_x = res_x.get_forecast(steps=h, exog=test[exog_cols])
fcst_0 = res_0.get_forecast(steps=h)
pred_x = fcst_x.predicted_mean
pred_0 = fcst_0.predicted_mean
def rmse(y_true, y_pred):
y_true = np.asarray(y_true)
y_pred = np.asarray(y_pred)
return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
def mae(y_true, y_pred):
y_true = np.asarray(y_true)
y_pred = np.asarray(y_pred)
return float(np.mean(np.abs(y_true - y_pred)))
print("Test RMSE (with exog):", rmse(test["y"], pred_x))
print("Test RMSE (no exog): ", rmse(test["y"], pred_0))
print("Test MAE (with exog):", mae(test["y"], pred_x))
print("Test MAE (no exog): ", mae(test["y"], pred_0))
res_x.summary()
Test RMSE (with exog): 1.6851935314621316
Test RMSE (no exog): 1.7018128229455727
Test MAE (with exog): 1.370597378351971
Test MAE (no exog): 1.4392971938785968
| Dep. Variable: | y | No. Observations: | 156 |
|---|---|---|---|
| Model: | SARIMAX(1, 0, 0)x(1, 0, 0, 12) | Log Likelihood | -276.615 |
| Date: | Sat, 31 Jan 2026 | AIC | 565.229 |
| Time: | 12:35:10 | BIC | 583.528 |
| Sample: | 01-01-2010 | HQIC | 572.661 |
| | - 12-01-2022 | | |
| Covariance Type: | opg | | |
| coef | std err | z | P>|z| | [0.025 | 0.975] | |
|---|---|---|---|---|---|---|
| intercept | 18.2550 | 3.754 | 4.863 | 0.000 | 10.897 | 25.613 |
| stimulus | 7.3388 | 0.475 | 15.462 | 0.000 | 6.409 | 8.269 |
| rate | -4.3796 | 0.666 | -6.577 | 0.000 | -5.685 | -3.074 |
| ar.L1 | 0.5277 | 0.067 | 7.872 | 0.000 | 0.396 | 0.659 |
| ar.S.L12 | 0.2335 | 0.107 | 2.183 | 0.029 | 0.024 | 0.443 |
| sigma2 | 2.0176 | 0.229 | 8.829 | 0.000 | 1.570 | 2.466 |
| Ljung-Box (L1) (Q): | 0.33 | Jarque-Bera (JB): | 2.59 |
|---|---|---|---|
| Prob(Q): | 0.56 | Prob(JB): | 0.27 |
| Heteroskedasticity (H): | 0.88 | Skew: | -0.30 |
| Prob(H) (two-sided): | 0.65 | Kurtosis: | 3.21 |
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
# Plot holdout forecasts
ci_x = fcst_x.conf_int()
lower_x = ci_x.iloc[:, 0]
upper_x = ci_x.iloc[:, 1]
fig = go.Figure()
fig.add_trace(
go.Scatter(
x=train.index,
y=train["y"],
name="train y",
line=dict(color="rgba(0,0,0,0.35)"),
)
)
fig.add_trace(go.Scatter(x=test.index, y=test["y"], name="test y", line=dict(color="black")))
fig.add_trace(
go.Scatter(
x=test.index,
y=pred_0,
name="forecast: SARIMA (no exog)",
line=dict(color="#F28E2B"),
)
)
fig.add_trace(
go.Scatter(
x=test.index,
y=pred_x,
name="forecast: SARIMAX (with exog)",
line=dict(color="#4E79A7"),
)
)
fig.add_trace(
go.Scatter(x=test.index, y=lower_x, showlegend=False, line=dict(color="rgba(78,121,167,0.0)"))
)
fig.add_trace(
go.Scatter(
x=test.index,
y=upper_x,
showlegend=False,
line=dict(color="rgba(78,121,167,0.0)"),
fill="tonexty",
fillcolor="rgba(78,121,167,0.15)",
)
)
fig.add_vline(x=test.index[0], line_dash="dash", line_color="black")
fig.update_layout(
title="Forecast comparison (holdout): SARIMA vs SARIMAX",
xaxis_title="Date",
yaxis_title="y",
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0),
)
fig.show()
Note on the intercept in ARMA/SARIMA#
In many ARMA/SARIMA parameterizations, the reported intercept is the ARMA constant \(c\), not the unconditional mean \(\mu\).
For an AR(1)×seasonal AR(1) model with constant \(c\), the implied mean offset is:

\[
\mu = \frac{c}{(1 - \phi_1)(1 - \Phi_1)}.
\]

The chart below uses this implied \(\mu\) when drawing \(\mu + X\hat\beta\).
# --- Plotly chart: estimated impact of exogenous variables ---
params = res_x.params.copy()
beta_hat = params.reindex(exog_cols).astype(float)
intercept = float(params.get("intercept", params.get("const", 0.0)))
phi1 = float(params.get("ar.L1", 0.0))
Phi1 = float(params.get("ar.S.L12", 0.0))
phi_at_1 = (1.0 - phi1) * (1.0 - Phi1)
mu = intercept / phi_at_1 if phi_at_1 != 0.0 else intercept
contrib = df[exog_cols].to_numpy() * beta_hat.to_numpy() # (n, k) * (k,)
total_contrib = contrib.sum(axis=1)
mean_part = mu + total_contrib
fig = make_subplots(
rows=2,
cols=1,
shared_xaxes=True,
vertical_spacing=0.07,
subplot_titles=(
"Observed y vs estimated exog-driven mean (μ + Xβ)",
"Estimated contribution of each exogenous variable",
),
)
fig.add_trace(go.Scatter(x=df.index, y=df["y"], name="y", line=dict(color="black")), row=1, col=1)
fig.add_trace(
go.Scatter(x=df.index, y=mean_part, name="μ + Xβ", line=dict(color="#4E79A7")),
row=1,
col=1,
)
for j, col in enumerate(exog_cols):
fig.add_trace(
go.Scatter(x=df.index, y=contrib[:, j], name=f"{col} contribution"),
row=2,
col=1,
)
fig.add_trace(
go.Scatter(
x=df.index,
y=total_contrib,
name="total Xβ",
line=dict(color="rgba(0,0,0,0.6)", dash="dash"),
),
row=2,
col=1,
)
fig.update_yaxes(title_text="y", row=1, col=1)
fig.update_yaxes(title_text="contribution", row=2, col=1)
fig.update_layout(
title="Impact of exogenous variables (estimated from SARIMAX fit)",
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0),
)
fig.show()
beta_hat
stimulus 7.338761
rate -4.379605
dtype: float64
# --- Plotly chart: counterfactual-style forecast scenarios ---
# IMPORTANT: this is a model-based conditional scenario, not necessarily a causal effect.
exog_base = test[exog_cols].copy()
exog_cf = exog_base.copy()
exog_cf["stimulus"] = 0.0
pred_base = res_x.get_forecast(steps=h, exog=exog_base).predicted_mean
pred_cf = res_x.get_forecast(steps=h, exog=exog_cf).predicted_mean
effect = pred_base - pred_cf
fig = go.Figure()
fig.add_trace(
go.Scatter(x=train.index, y=train["y"], name="train y", line=dict(color="rgba(0,0,0,0.35)"))
)
fig.add_trace(go.Scatter(x=test.index, y=test["y"], name="test y", line=dict(color="black")))
fig.add_trace(
go.Scatter(
x=test.index,
y=pred_base,
name="baseline forecast (observed exog)",
line=dict(color="#4E79A7"),
)
)
fig.add_trace(
go.Scatter(
x=test.index,
y=pred_cf,
name="counterfactual (stimulus=0)",
line=dict(color="#E15759", dash="dash"),
)
)
fig.add_vline(x=test.index[0], line_dash="dash", line_color="black")
fig.update_layout(
title="Counterfactual-style forecast: set stimulus=0 in the horizon",
xaxis_title="Date",
yaxis_title="y",
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0),
)
fig.show()
go.Figure(
data=[go.Bar(x=test.index, y=effect, name="baseline - counterfactual")],
layout=go.Layout(
title="Estimated stimulus effect in the forecast horizon (model-based)",
xaxis_title="Date",
yaxis_title="Δy",
),
).show()
effect.describe()
count 24.0
mean 0.0
std 0.0
min 0.0
25% 0.0
50% 0.0
75% 0.0
max 0.0
Name: predicted_mean, dtype: float64
The effect is identically zero here because the observed stimulus is already 0 throughout the holdout window, so the "stimulus = 0" counterfactual coincides with the baseline. To see a nonzero scenario effect, set `exog_cf["stimulus"] = 1.0` instead, or pick a horizon that overlaps an active stimulus period.
5) Low-level NumPy implementation (educational approximation)#
A full SARIMAX fit typically uses maximum likelihood via a state-space model and the Kalman filter (this is what statsmodels does).
To keep the mechanics transparent, the next section implements a simple two-stage approximation:
Fit the regression \(y_t \approx \beta_0 + \mathbf{x}_t^\top\boldsymbol{\beta}\) by OLS.
Fit a seasonal AR model on the residuals (here with lags 1 and \(s = 12\)):

\[
\hat{u}_t = \phi_1 \hat{u}_{t-1} + \Phi_1 \hat{u}_{t-s} + e_t.
\]
Forecast by combining the regression mean and the recursive residual forecast.
This corresponds to an ARIMAX-like model with no MA terms and no joint estimation; it is useful for intuition, not as a drop-in replacement for SARIMAX MLE.
def add_intercept(X: np.ndarray) -> np.ndarray:
return np.column_stack([np.ones((X.shape[0], 1)), X])
def fit_ols(X: np.ndarray, y: np.ndarray) -> np.ndarray:
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
return beta
def fit_seasonal_ar(residuals: np.ndarray, lags: list[int]) -> np.ndarray:
max_lag = max(lags)
y = residuals[max_lag:]
X = np.column_stack([residuals[max_lag - lag : -lag] for lag in lags])
params, *_ = np.linalg.lstsq(X, y, rcond=None)
return params
def forecast_seasonal_ar(
residuals_history: np.ndarray,
ar_params: np.ndarray,
lags: list[int],
steps: int,
) -> np.ndarray:
max_lag = max(lags)
hist = residuals_history.astype(float).copy()
if hist.shape[0] < max_lag:
raise ValueError("Need at least max(lags) residual history values.")
preds: list[float] = []
for _ in range(steps):
t = hist.shape[0]
r_hat = 0.0
for coef, lag in zip(ar_params, lags):
r_hat += float(coef) * float(hist[t - lag])
preds.append(r_hat)
hist = np.append(hist, r_hat)
return np.asarray(preds)
# Stage 1: OLS regression
X_train = add_intercept(train[exog_cols].to_numpy())
y_train = train["y"].to_numpy()
beta_ols = fit_ols(X_train, y_train)
resid_train = y_train - X_train @ beta_ols
# Stage 2: seasonal AR on residuals (lags 1 and 12)
lags = [1, 12]
ar_params = fit_seasonal_ar(resid_train, lags)
# Forecast residuals and combine with regression mean
resid_fcst = forecast_seasonal_ar(resid_train, ar_params, lags, steps=h)
X_test = add_intercept(test[exog_cols].to_numpy())
pred_numpy = X_test @ beta_ols + resid_fcst
print("OLS betas (intercept, stimulus, rate):", beta_ols)
print("Seasonal AR params (lag 1, lag 12):", ar_params)
print("Test RMSE (NumPy two-stage):", rmse(test["y"], pred_numpy))
fig = go.Figure()
fig.add_trace(go.Scatter(x=test.index, y=test["y"], name="test y", line=dict(color="black")))
fig.add_trace(go.Scatter(x=test.index, y=pred_x, name="SARIMAX forecast", line=dict(color="#4E79A7")))
fig.add_trace(
go.Scatter(
x=test.index,
y=pred_numpy,
name="NumPy two-stage",
line=dict(color="#59A14F", dash="dash"),
)
)
fig.update_layout(
title="Holdout forecast: statsmodels SARIMAX vs NumPy two-stage approximation",
xaxis_title="Date",
yaxis_title="y",
)
fig.show()
OLS betas (intercept, stimulus, rate): [51.76806591 7.28392131 -5.13479592]
Seasonal AR params (lag 1, lag 12): [0.51265418 0.20647124]
Test RMSE (NumPy two-stage): 1.6113110846360617
Pitfalls + diagnostics#
Leakage (alignment errors): if \(x_t\) includes information from the future (even subtly via rolling features), you will overestimate performance.
Endogeneity: in many finance/econ settings, \(x\) and \(y\) influence each other. SARIMAX can forecast, but coefficients can be misleading as “effects”.
Collinearity: correlated regressors inflate variance in \(\hat{\beta}\); interpret with caution.
Forecasting exog: SARIMAX needs future \(\mathbf{x}_{t+h}\); if you can’t know it, you must model it or do scenario ranges.
Residual checks: inspect residual ACF/PACF and run stationarity tests on residuals; remaining structure suggests mis-specified orders or missing regressors.
Order selection: start simple (small p/q/P/Q) and validate on rolling-origin backtests.
Exercises + references#
Exercises:
Add a lagged version of `rate` (e.g., `rate.shift(1)`) and see how coefficients and forecasts change.
Introduce a trend in `rate` and compare: include a trend term vs differencing.
Replace the binary `stimulus` with a continuous "spend" variable and examine scaling/standardization.
Do a rolling backtest and compare SARIMA vs SARIMAX stability over time.
References:
`statsmodels.tsa.statespace.SARIMAX` documentation (state-space + Kalman filter implementation).
Box, Jenkins, Reinsel & Ljung: Time Series Analysis: Forecasting and Control (classic ARIMA/SARIMA reference).